157 research outputs found

    Count ratio model reveals bias affecting NGS fold changes

    Various biases affect high-throughput sequencing read counts. Contrary to the general assumption, we show that bias does not always cancel out when fold changes are computed and that bias affects more than 20% of genes that are called differentially regulated in RNA-seq experiments with drastic effects on subsequent biological interpretation. Here, we propose a novel approach to estimate fold changes. Our method is based on a probabilistic model that directly incorporates count ratios instead of read counts. It provides a theoretical foundation for pseudo-counts and can be used to estimate fold change credible intervals as well as normalization factors that outperform currently used normalization methods. We show that fold change estimates are significantly improved by our method by comparing RNA-seq derived fold changes to qPCR data from the MAQC/SEQC project as a reference and analyzing random barcoded sequencing data. Our software implementation is freely available from the project website http://www.bio.ifi.lmu.de/software/lfc
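
    The link between pseudo-counts and a probabilistic model of count ratios can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a symmetric Beta prior whose parameters equal the pseudo-count, and the function names, the default pseudo-count of 1 and the Monte Carlo interval are hypothetical choices made for this example.

```python
import math
import random


def log2_fold_change(a, b, pseudo=1.0):
    """Log2 fold change of read counts a vs. b with a pseudo-count.

    Under a Beta(pseudo, pseudo) prior on the proportion p = a / (a + b),
    this is the log odds of the posterior mean of p.
    """
    return math.log2((a + pseudo) / (b + pseudo))


def lfc_credible_interval(a, b, pseudo=1.0, level=0.95, draws=20000, seed=0):
    """Equal-tailed credible interval for the log2 fold change.

    Samples p from the Beta(a + pseudo, b + pseudo) posterior and
    transforms each draw to the log2 odds p / (1 - p).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(draws):
        p = rng.betavariate(a + pseudo, b + pseudo)
        # Clamp to avoid division by zero for extreme draws.
        p = min(max(p, 1e-300), 1.0 - 1e-16)
        samples.append(math.log2(p / (1.0 - p)))
    samples.sort()
    tail = (1.0 - level) / 2.0
    lo = samples[int(tail * draws)]
    hi = samples[int((1.0 - tail) * draws) - 1]
    return lo, hi


if __name__ == "__main__":
    estimate = log2_fold_change(30, 10)
    low, high = lfc_credible_interval(30, 10)
    print(f"log2 fold change {estimate:.3f}, 95% interval [{low:.3f}, {high:.3f}]")
```

    Low counts give wide intervals and high counts give narrow ones, which is the kind of uncertainty information a bare ratio of read counts does not carry.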

    Dickson's Lemma, Higman's Theorem and Beyond: a survey of some basic results in order theory

    We provide proofs for the fact that certain orders have no infinite descending chains and no infinite antichains. Comment: Survey paper

    Algorithmic methods for systems biology of Herpes-viral microRNAs

    Recent technological advances have made it possible to measure various parameters of biological processes in a genome-wide manner. While traditional molecular biology focuses on individual processes using targeted experiments (reductionist approach), the field of systems biology uses high-throughput experiments to determine the state of a complete system, such as a cell, at once (holistic approach). Systems biology is not only carried out in the wet lab but for the most part also requires tailored computational methods. High-throughput experiments can produce massive amounts of data that are often too complex for a human to comprehend directly, that are affected by substantial noise, i.e. random measurement variation, and that are often subject to considerable bias, i.e. systematic deviations of the measurement from the truth. Thus, computer science and statistical methods are necessary for a proper analysis of raw data from such large-scale experiments. The goal of systems biology is to understand a whole system such as a cell in a quantitative manner. Thus, the computational part does not end with analyzing raw data but also involves visualization, statistical analysis, integration and interpretation. One example of these four computational tasks is as follows: Processes in biological systems are often modeled as networks, for instance gene regulatory networks (GRNs), which represent the interactions of transcription factors (TFs) and their target genes. Experiments can provide both the identity and wiring of all constituent parts of the network and the parameters that allow the processes in the system to be described in a quantitative manner. A network provides a straightforward way to visualize the state and processes of a whole system, its statistical analysis can reveal interesting properties of biological systems, it can integrate several datasets from various experiments, and simulations of the network can help to interpret the data. 
In recent years, microRNAs have emerged as important contributors to gene regulation in eukaryotes, breaking the traditional dogma of molecular biology, in which DNA is transcribed to RNA, which is subsequently translated into proteins. MicroRNAs are small RNAs that are not translated but are functional as RNAs: they are able to target specific messenger RNAs (mRNAs) and typically lead to their downregulation. Thus, in addition to TFs, microRNAs also play important roles in GRNs. Interestingly, microRNAs are encoded not only by animal genomes, including the human genome, but also by several pathogens such as viruses. In this work I developed several computational systems biology methods and applied them to high-throughput experimental data in the context of a project on herpesviral microRNAs. Three methods, ALPS, PARma and REA, are designed for the analysis of certain types of raw data, namely short RNA-seq, PAR-CLIP and RIP-Chip data, respectively. All of these experiments are widely used, and my methods are publicly available on the internet and can be used by the research community to analyze new datasets. For these methods I developed non-trivial statistical methods (e.g. the EM algorithm kmerExplain in PARma) and implemented and adapted algorithms from traditional computer science and bioinformatics (e.g. alignment of pattern matrices in ALPS). I applied these novel methods to data measured by our cooperation partners in the herpes virus project. Among other things, I discovered and investigated an important aspect of microRNA-mediated regulation: microRNAs recognize their targets in a context-dependent manner. The widespread impact of context on regulation is widely accepted for transcriptional regulation, but only a few examples are known for microRNA-mediated regulation. 
By integrating various herpes-related datasets, I could show that context dependency is not restricted to a few examples but is a widespread feature of post-transcriptional regulation mediated by microRNAs. Importantly, this is true for both human host microRNAs and viral microRNAs. Furthermore, I considered additional aspects of the data measured in the context of the herpes virus project: Alternative splicing has been shown to be a major contributor to protein diversity. Splicing is tightly regulated and possibly important in virus infection. Mass spectrometry is able to measure peptides quantitatively and genome-wide in high throughput. However, no method was available to detect splicing patterns in mass spectrometry data, which was one of the data types measured in the project. Thus, I investigated whether mass spectrometry offers the opportunity to identify cases of differential splicing on a large scale. Finally, I also focused on networks in systems biology, especially on their simulation. Being able to simulate networks in order to predict the behavior of systems is one of the central goals of computational systems biology. In my diploma thesis, I developed a comprehensive modeling platform (PNMA, the Petri net modeling application) that is able to simulate biological systems in various ways. For highly detailed simulations, I further developed FERN, a framework for stochastic simulation that is not only integrated in PNMA but also available stand-alone or as plugins for the widely used software tools Cytoscape and CellDesigner. In systems biology, the major bottleneck is computational analysis, not the generation of data. Experiments become cheaper every year, and the throughput and diversity of data increase accordingly. Thus, developing new methods and usable software tools is essential for further progress. 
The methods I have developed in this work are a step in this direction, but it is apparent that more effort must be devoted to keeping up with the massive amounts of data that are being produced and will be produced in the future.

    FERN – a Java framework for stochastic simulation and evaluation of reaction networks

    Background: Stochastic simulation can be used to illustrate the development of biological systems over time and the stochastic nature of these processes. Currently available programs for stochastic simulation, however, are limited in that they either a) do not provide the most efficient simulation algorithms and are difficult to extend, b) cannot be easily integrated into other applications or c) do not allow the user to monitor and intervene in the simulation process in an easy and intuitive way. Thus, in order to use stochastic simulation in innovative high-level modeling and analysis approaches, more flexible tools are necessary. Results: In this article, we present FERN (Framework for Evaluation of Reaction Networks), a Java framework for the efficient simulation of chemical reaction networks. FERN is subdivided into three layers for network representation, simulation and visualization of the simulation results, each of which can be easily extended. It provides efficient and accurate state-of-the-art stochastic simulation algorithms for well-mixed chemical systems and a powerful observer system, which makes it possible to track and control the simulation progress on every level. To illustrate how FERN can be easily integrated into other systems biology applications, plugins for Cytoscape and CellDesigner are included. These plugins make it possible to run simulations and to observe the simulation progress in a reaction network in real time from within the Cytoscape or CellDesigner environment. Conclusion: FERN addresses shortcomings of currently available stochastic simulation programs in several ways. First, it provides a broad range of efficient and accurate algorithms both for exact and approximate stochastic simulation and a simple interface for extending to new algorithms. FERN's implementations are considerably faster than the C implementations of gillespie2 or the Java implementations of ISBJava. 
Second, it can be used in a straightforward way both as a stand-alone program and within new systems biology applications. Finally, complex scenarios requiring intervention during the simulation progress can be modelled easily with FERN.
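
    Exact stochastic simulation of a well-mixed reaction network, as described above, is commonly done with Gillespie's direct method. The following is a minimal Python sketch of that method for illustration only; it does not use or reproduce FERN's Java API, and the function name, the reaction tuple layout and the example network are assumptions made for this example.

```python
import math
import random


def gillespie_direct(state, reactions, t_end, rng):
    """Gillespie's direct method for a well-mixed reaction network.

    state:     dict mapping species name -> molecule count
    reactions: list of (rate_constant, reactants, products), where
               reactants and products map species -> stoichiometry
    Returns the trajectory as a list of (time, state snapshot) pairs.
    """
    t = 0.0
    state = dict(state)
    trajectory = [(t, dict(state))]
    while True:
        # Propensity = rate constant times number of reactant combinations.
        propensities = []
        for k, reactants, _ in reactions:
            a = k
            for species, n in reactants.items():
                a *= math.comb(state[species], n)
            propensities.append(a)
        total = sum(propensities)
        if total == 0:
            break  # no reaction can fire any more
        # The waiting time to the next event is exponentially distributed.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Pick one reaction with probability proportional to its propensity.
        threshold = rng.random() * total
        acc = 0.0
        for (k, reactants, products), a in zip(reactions, propensities):
            acc += a
            if threshold < acc:
                break
        for species, n in reactants.items():
            state[species] -= n
        for species, n in products.items():
            state[species] = state.get(species, 0) + n
        trajectory.append((t, dict(state)))
    return trajectory


if __name__ == "__main__":
    # Hypothetical example network: irreversible conversion A -> B.
    network = [(1.0, {"A": 1}, {"B": 1})]
    run = gillespie_direct({"A": 50, "B": 0}, network, 5.0, random.Random(1))
    print(len(run) - 1, "reaction events, final state", run[-1][1])
```

    Recording a snapshot after every event is a crude stand-in for the observer idea mentioned in the abstract; a caller could inspect or modify the state between events in the same place.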

    Electrically and Magnetically Charged States and Particles in the 2+1-Dimensional Z_N-Higgs Gauge Model

    Electrically as well as magnetically charged states are constructed in the 2+1-dimensional Euclidean Z_N-Higgs lattice gauge model, the former following ideas of Fredenhagen and Marcu and the latter using duality transformations on the algebra of observables. The existence of electrically and of magnetically charged particles is also established. With this work we prepare the ground for the constructive study of anyonic statistics of multiparticle scattering states of electrically and magnetically charged particles in this model (work in progress). Comment: 57 pages, Sfb 288 Preprint No. 109. To appear in Commun. Math. Phys. About the file: This is a uuencoded, gzipped PostScript file of about 300 kB. The original PostScript file is about 700 kB. All figures are included. The LaTeX sources or even hard copies can be requested from the authors at [email protected] or Freie Universitaet Berlin, Institut fuer Theoretische Physik, Arnimallee 14, Berlin 14195, Germany

    Widespread disruption of host transcription termination in HSV-1 infection.

    Herpes simplex virus 1 (HSV-1) is an important human pathogen and a paradigm for virus-induced host shut-off. Here we show that global changes in transcription and RNA processing and their impact on translation can be analysed in a single experimental setting by applying 4sU-tagging of newly transcribed RNA and ribosome profiling to lytic HSV-1 infection. Unexpectedly, we find that HSV-1 triggers the disruption of transcription termination of cellular, but not viral, genes. This results in extensive transcription for tens of thousands of nucleotides beyond poly(A) sites and into downstream genes, leading to novel intergenic splicing between exons of neighbouring cellular genes. As a consequence, hundreds of cellular genes seem to be transcriptionally induced but are not translated. In contrast to previous reports, we show that HSV-1 does not inhibit co-transcriptional splicing. Our approach thus substantially advances our understanding of HSV-1 biology and establishes HSV-1 as a model system for studying transcription termination. This work was supported by MRC Fellowship grant G1002523 and NHSBT grant WP11-05 to LD, and DFG grant FR2938/1–2 to C.C.F. We thank Viv Connor for excellent technical assistance and Professor Rozanne Sandri-Goldin (University of California) for the ΔICP27 mutant and complementing cell line. The support of the Cluster of Excellence (Nucleotide lab) to P.R. is acknowledged. This is the final version of the article. It first appeared from NPG via http://dx.doi.org/10.1038/ncomms812

    Red Blood Cell Contamination of the Final Cell Product Impairs the Efficacy of Autologous Bone Marrow Mononuclear Cell Therapy

    Objectives: The aim of this study was to identify an association between the quality and functional activity of bone marrow-derived progenitor cells (BMCs) used for cardiovascular regenerative therapies and contractile recovery in patients with acute myocardial infarction included in the placebo-controlled REPAIR-AMI (Reinfusion of Enriched Progenitor cells And Infarct Remodeling in Acute Myocardial Infarction) trial. Background: Isolation procedures of autologous BMCs might affect cell functionality and therapeutic efficacy. Methods: Quality of cell isolation was assessed by measuring the total number of isolated BMCs, CD34+ and CD133+ cells, their colony-forming unit (CFU) and invasion capacity, cell viability, and contamination of the final BMC preparation with thrombocytes and red blood cells (RBCs). Results: The number of RBCs contaminating the final cell product significantly correlated with reduced recovery of left ventricular ejection fraction 4 months after BMC therapy (p = 0.007). Higher numbers of RBCs in the BMC preparation were associated with reduced BMC viability (r = −0.23, p = 0.001), CFU capacity (r = −0.16, p = 0.03), and invasion capacity (r = −0.27, p < 0.001). To assess a causal role for RBC contamination, we coincubated isolated BMCs with RBCs for 24 h in vitro. The addition of RBCs dose-dependently abrogated migratory capacity (p = 0.003) and reduced CFU capacity (p < 0.05) of isolated BMCs. Neovascularization capacity was significantly impaired after infusion of BMCs contaminated with RBCs, compared with BMCs alone (p < 0.05). Mechanistically, the addition of RBCs was associated with a profound reduction in mitochondrial membrane potential of BMCs. Conclusions: Contaminating RBCs affect the functionality of isolated BMCs and determine the extent of left ventricular ejection fraction recovery after intracoronary BMC infusion in patients with acute myocardial infarction. 
These results suggest a bioactivity-response relationship very much like a dose–response relationship in drug trials. (Reinfusion of Enriched Progenitor cells and Infarct Remodeling in Acute Myocardial Infarction [REPAIR-AMI]; NCT00279175)

    Kassiopeia: A Modern, Extensible C++ Particle Tracking Package

    The Kassiopeia particle tracking framework is an object-oriented software package using modern C++ techniques, written originally to meet the needs of the KATRIN collaboration. Kassiopeia features a new algorithmic paradigm for particle tracking simulations which targets experiments containing complex geometries and electromagnetic fields, with high priority put on calculation efficiency, customizability, extensibility, and ease of use for novice programmers. To solve Kassiopeia's target physics problem, the software is capable of simulating particle trajectories governed by arbitrarily complex differential equations of motion, continuous physics processes that may in part be modeled as terms perturbing that equation of motion, stochastic processes that occur in flight such as bulk scattering and decay, and stochastic surface processes occurring at interfaces, including transmission and reflection effects. This entire set of computations takes place against the backdrop of a rich geometry package which serves a variety of roles, including initialization of electromagnetic field simulations and the support of state-dependent algorithm-swapping and behavioral changes as a particle's state evolves. Thanks to the very general approach taken by Kassiopeia, it can be used by other experiments facing similar challenges when calculating particle trajectories in electromagnetic fields. It is publicly available at https://github.com/KATRIN-Experiment/Kassiopei
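
    To make the idea of tracking a charged particle through electromagnetic fields concrete, here is a minimal Python sketch of one standard integrator, the Boris scheme, for the non-relativistic Lorentz force. It is not taken from Kassiopeia and says nothing about which integrators that package uses; the function names, the uniform constant fields and the unit charge-to-mass ratio are assumptions made for this example.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def boris_step(x, v, e_field, b_field, q_over_m, dt):
    """Advance position x and velocity v by one Boris step."""
    half = 0.5 * q_over_m * dt
    # First half of the electric acceleration.
    v_minus = tuple(vi + half * ei for vi, ei in zip(v, e_field))
    # Magnetic rotation, which preserves the speed.
    t = tuple(half * bi for bi in b_field)
    t_sq = sum(ti * ti for ti in t)
    s = tuple(2.0 * ti / (1.0 + t_sq) for ti in t)
    v_prime = tuple(a + b for a, b in zip(v_minus, cross(v_minus, t)))
    v_plus = tuple(a + b for a, b in zip(v_minus, cross(v_prime, s)))
    # Second half of the electric acceleration, then the position update.
    v_new = tuple(vi + half * ei for vi, ei in zip(v_plus, e_field))
    x_new = tuple(xi + dt * vi for xi, vi in zip(x, v_new))
    return x_new, v_new


def track(x, v, e_field, b_field, q_over_m, dt, steps):
    """Return the list of positions and the final velocity."""
    path = [x]
    for _ in range(steps):
        x, v = boris_step(x, v, e_field, b_field, q_over_m, dt)
        path.append(x)
    return path, v


if __name__ == "__main__":
    # Gyration in a uniform magnetic field along z, with no electric field.
    path, v_final = track((0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                          (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 1.0, 0.01, 1000)
    print("final speed:", math.sqrt(sum(c * c for c in v_final)))
```

    A production tracker would evaluate position-dependent fields at every step and add the stochastic in-flight and surface processes the abstract lists; the sketch covers only the deterministic equation of motion.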

    Widespread context dependency of microRNA-mediated regulation

    Gene expression is regulated in a context-dependent, cell-type specific manner. Condition-specific transcription is dependent on the presence of transcription factors (TFs) that can activate or inhibit their target genes (global context). Additional factors such as chromatin structure, histone or DNA modifications also influence the activity of individual target genes (individual context). The role of the global and individual context for post-transcriptional regulation has not been systematically investigated on a large scale and is poorly understood. Here we show that global and individual context-dependency is a pervasive feature of microRNA-mediated regulation. Our comprehensive and highly consistent dataset from several high-throughput technologies (PAR-CLIP, RIP-Chip, 4sU-tagging and SILAC) provides strong evidence that context-dependent microRNA target sites (CDTS) are as frequent and functionally relevant as constitutive target sites (CTS). Furthermore, we found the global context to be insufficient to explain the CDTS and that flanking sequence motifs provide individual context that is an equally important factor. Our results demonstrate that, similar to TF-mediated regulation, global and individual context-dependency are prevalent in microRNA-mediated gene regulation, implying a much more complex post-transcriptional regulatory network than currently known. The tools needed to unravel post-transcriptional regulation and its mechanisms will have to be much more sophisticated, and much more data will be needed for particular cell types and cellular conditions to understand microRNA-mediated regulation and the context-dependent post-transcriptional regulatory network.